PoCL-R: An Open Standard Based Offloading Layer for Heterogeneous Multi-Access Edge Computing with Server Side Scalability
We propose a novel computing runtime that exposes remote compute devices via
the cross-vendor open heterogeneous computing standard OpenCL and can execute
compute tasks on the MEC cluster side across multiple servers in a scalable
manner. Intermittent UE connection loss is handled gracefully even if the
device's IP address changes on the way. Network-induced latency is minimized by
transferring data and signaling command completions between remote devices in a
peer-to-peer fashion directly to the target server with a streamlined TCP-based
protocol that yields a command latency of only 60 microseconds on top of
network round-trip latency in synthetic benchmarks. The runtime can utilize
RDMA to speed up inter-server data transfers by an additional 60% compared to
the TCP-based solution. The benefits of the proposed runtime in MEC
applications are demonstrated with a smartphone-based augmented reality
rendering case study. Measurements show up to 19x improvements to frame rate
and 17x improvements to local energy consumption when using the proposed
runtime to offload AR rendering from a smartphone. Scalability to multiple GPU
servers in real-world applications is shown in a computational fluid dynamics
simulation, which scales with the number of servers at roughly 80% efficiency,
which is comparable to an MPI port of the same simulation.
Comment: 13 pages, 17 figures
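The latency and scaling figures quoted in the abstract above can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the paper: the function names, the example timings, and the 500 microsecond round-trip time are assumptions chosen only for illustration, and "efficiency" is taken in the usual sense of speedup divided by server count.

```python
def parallel_efficiency(t_single: float, t_multi: float, n_servers: int) -> float:
    """Speedup over a single server divided by the number of servers."""
    if n_servers < 1 or t_single <= 0 or t_multi <= 0:
        raise ValueError("timings must be positive and n_servers >= 1")
    speedup = t_single / t_multi
    return speedup / n_servers


def command_latency_us(network_rtt_us: float, runtime_overhead_us: float = 60.0) -> float:
    """Total command latency: network round trip plus runtime overhead.

    The 60 microsecond default mirrors the overhead quoted in the abstract
    for synthetic benchmarks; real workloads may differ.
    """
    return network_rtt_us + runtime_overhead_us


if __name__ == "__main__":
    # Hypothetical timings: 100 s on one server, 31.25 s on four servers.
    print(parallel_efficiency(100.0, 31.25, 4))
    # Hypothetical 500 microsecond network round trip.
    print(command_latency_us(500.0))
```

With these made-up timings, four servers give a speedup of 3.2, which corresponds to the roughly 80% efficiency mentioned in the abstract.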
PoCL-R: A Scalable Low Latency Distributed OpenCL Runtime
Offloading the most demanding parts of applications to an edge GPU server cluster to save power or improve result quality is becoming increasingly realistic with new networking technologies. To make such a computing scheme feasible, an application programming layer is needed that provides both low latency and scalable utilization of remote heterogeneous computing resources. To this end, we propose a latency-optimized, scalable, distributed heterogeneous computing runtime implementing the standard OpenCL API. In the proposed runtime, network-induced latency is reduced by means of peer-to-peer data transfers and event synchronization as well as a streamlined control protocol implementation. Further improvements can be obtained by streaming source data directly from the producer device to the compute cluster. Compute cluster scalability is improved by distributing the command and event processing responsibilities to the remote compute servers. We also show how a simple optional OpenCL extension for buffers with dynamic content size can significantly speed up applications that use variable-length data. For evaluation, we present a smartphone-based augmented reality rendering case study which, using the runtime, achieves a 19× improvement in frames per second and a 17× improvement in energy per frame when offloading parts of the rendering workload to a nearby GPU server. The remote kernel execution latency overhead of the runtime is only 60 µs on top of the network round-trip time. Scalability on multi-server, multi-GPU clusters is shown with a distributed large matrix multiplication application.
Accepted version. Peer reviewed.
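To illustrate why a dynamic content size buffer can help with variable-length data, here is a toy cost model. It is a sketch under stated assumptions and does not reproduce the extension's actual API: the function names, buffer sizes, and link bandwidth are hypothetical, and the model simply assumes that a runtime aware of the content size moves only the meaningful bytes instead of the whole allocation.

```python
from typing import Optional


def bytes_to_transfer(allocated: int, content_size: Optional[int]) -> int:
    """Bytes moved over the network for one buffer migration.

    Without content size information the whole allocation is sent;
    with it, only the meaningful prefix is sent (assumed behavior).
    """
    if content_size is None:
        return allocated
    if not 0 <= content_size <= allocated:
        raise ValueError("content_size must lie within the allocation")
    return content_size


def transfer_time_ms(n_bytes: int, bandwidth_mbit_s: float) -> float:
    """Idealized transfer time, ignoring latency and protocol overhead."""
    return n_bytes * 8 / (bandwidth_mbit_s * 1e6) * 1e3


if __name__ == "__main__":
    allocated = 4 * 1024 * 1024   # hypothetical worst-case buffer: 4 MiB
    used = 200 * 1024             # hypothetical compressed frame: 200 KiB
    link = 100.0                  # hypothetical link speed in Mbit/s
    print(transfer_time_ms(bytes_to_transfer(allocated, None), link))
    print(transfer_time_ms(bytes_to_transfer(allocated, used), link))
```

The gap between the two printed times grows with the difference between the worst-case allocation and the data actually produced, which is the situation the abstract describes for variable-length data.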
ANALYZA – Datový sklad (Data Warehouse)
The Data Warehouse software component implements the storage for all data in the ANALYZA system, stored in practice as (large) files, objects, and the links between them. Besides holding the data to be analyzed, the Data Warehouse also serves as a space for storing intermediate results of analytical operations, for example by extending existing objects with new information or attributes, or by creating entirely new objects or files. The software places its main emphasis on the scalability, reliability, speed, and flexibility of the whole solution. In addition to the data warehouse itself, the archive contains a demonstration of supplementary software that interconnects several Data Warehouses (the so-called proxy component) and a demonstration of populating them through a user-friendly environment.